This is a practice of RNA-seq pathway analysis using a small portion (6 out of 156) of GEO GSE135378 dataset. Differential expression analysis was done with DESeq2. Pathway analysis was performed for GO ontology and MSigDb gene sets.

Related paper: Pappalardi MB, et al. https://doi.org/10.1038/s43018-021-00249-x

The paper describes the discovery and systematic characterization of a novel small molecule DNA methyltransferase (DNMT) 1 inhibitor GSK3685032 and its anti-tumor activity in acute myeloid leukemia cell lines. Decitabine (DAC), a nucleoside analog and classic non-specific DNMT inhibitor, was used for comparison of the effects on DNA demethylation, transcription changes, DNA damage, and tumor inhibitory activity but not on specific gene expression.

In this version of exercise, we used Salmon to quantify original reads and obtained a count matrix.

Original pair-ended reads from leukemia-derived macrophage cell line MV4-11 cells treated with 400 nM decitabine (DAC) or GSK3685032 (GSK in short hereafter) for 4 days together with vehicle (DMSO) control were downloaded from NCBI SRA. Each treatment has two set of replicates, each of which includes six samples (six SRRs). A set of reads were combined in Slamon to obtain a single count file (quant.sf). Therefore, the result was six count files, two for each condition (DMSO, GSK032, and DAC). The reads appeared to have high base quality as assessed with fastQC (not shown) and were used for quantification without further processing (e.g., adapter removal). Salmon reported the alignment rates were between 50% - 60%, expect one DMSO sample (slightly over 70%).

SRA download and Salmon quantification was performed on Linux command line. See the repository for the scripts.

MSigDB gene sets were download manually from the website.

The results were consistent with broad effects of DNA demethylation. However, this is a technical practice not aiming to reproduce the original analyses or draw any biological insights.

Load the libraries.

library(AnnotationDbi)
library(org.Hs.eg.db)
library(EnsDb.Hsapiens.v86)
library(DESeq2)
library(tximport)
library(apeglm)
library(EDASeq)
library(limma)
library(clusterProfiler)
library(fgsea)
library(msigdbr)
library(ReactomePA)
library(vsn)
library(hexbin)
library(pheatmap)
library(dplyr)
library(readr)
library(stringr)
library(ggplot2)
library(RColorBrewer)
library(Glimma)
library(ggvenn)
library(plotly)

Data preparation

Prepare a tx2gene file

Gene names in quant.sf are in ENSEMBL format, so we use EnsemblDb to prepare the file.

edb = EnsDb.Hsapiens.v86
edb_transcripts = transcripts(edb, return.type = "DataFrame")
edb_tx2gene = edb_transcripts[, c("tx_name", "gene_id")]
#write.table(edb_tx2gene, "edb_tx2gene", row.names = FALSE, quote = FALSE)

Read in salmon counts and construct a DESeq dataset

We include batch in the design formula.

salmon_sample_info = read.table("salmon_sample_info.txt", header = TRUE)
rownames(salmon_sample_info) = salmon_sample_info$sample
salmon_sample_info$condition = factor(salmon_sample_info$condition)
salmon_sample_info$batch = str_split_i(salmon_sample_info$sample, "_", 2)
salmon_sample_info$batch = as.factor(salmon_sample_info$batch)

quant_files = file.path(getwd(), "pappa-salmon-sf", paste0(salmon_sample_info$sample, "_quant.sf"))
names(quant_files) <- salmon_sample_info$sample
txi = tximport(quant_files, type="salmon", tx2gene=edb_tx2gene, ignoreTxVersion = TRUE)
## reading in files with read_tsv
## 1 2 3 4 5 6 
## transcripts missing from tx2gene: 191979
## summarizing abundance
## summarizing counts
## summarizing length
# Construct DESeq dataset
ddsTxi <- DESeqDataSetFromTximport(txi,
                                   colData = salmon_sample_info,
                                   design = ~ batch + condition)
## using counts and average transcript lengths from tximport

Filtering read count

Filter the counts, removing low counts as recommended by the authors of DESeq. We include rows with at least two samples having count >= 10.

# We have 3 conditions, each with two replicates. 
# Note that original authors set a much looser criterion (>= 2 reads) even with more samples.
keep = rowSums(counts(ddsTxi) >= 10) >= 2
sum(keep)  # 21700
## [1] 17757
# Total rows: 39376, so 39376 - 21700 = 17676 removed.
ddsTxi = ddsTxi[keep, ]

Differential expression analysis

DESeq task

# Tell DESeq the reference level (control).
# First check the condition.
print(ddsTxi$condition)
## [1] DMSO DMSO GSK  GSK  DAC  DAC 
## Levels: DAC DMSO GSK
print(ddsTxi$batch)
## [1] 1 2 1 2 1 2
## Levels: 1 2
# Set the condition.
ddsTxi$condition= factor(ddsTxi$condition, levels = c("DMSO", "DAC", "GSK"))
print(ddsTxi$condition)
## [1] DMSO DMSO GSK  GSK  DAC  DAC 
## Levels: DMSO DAC GSK
# Generate the log fold change table.
ddsTxi = DESeq(ddsTxi)
## estimating size factors
## using 'avgTxLength' from assays(dds), correcting for library size
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
print(ddsTxi)
## class: DESeqDataSet 
## dim: 17757 6 
## metadata(1): version
## assays(6): counts avgTxLength ... H cooks
## rownames(17757): ENSG00000000419 ENSG00000000457 ... ENSG00000283689
##   ENSG00000283696
## rowData names(30): baseMean baseVar ... deviance maxCooks
## colnames(6): DMSO_1 DMSO_2 ... DAC_1 DAC_2
## colData names(4): sample condition SRR_number batch

Get the results for GSK and DAC.

Significance level set to 0.05 as that from original authors.

res_GSK = results(ddsTxi, contrast = c("condition", "GSK", "DMSO"), alpha = 0.05)
# Sort the rows by fold change.
res_GSK = res_GSK[order(res_GSK$log2FoldChange, decreasing = TRUE), ]
print(summary(res_GSK))
## 
## out of 17757 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up)       : 3188, 18%
## LFC < 0 (down)     : 3296, 19%
## outliers [1]       : 0, 0%
## low counts [2]     : 0, 0%
## (mean count < 3)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
## 
## NULL
res_DAC = results(ddsTxi, contrast = c("condition", "DAC", "DMSO"), alpha = 0.05)
# Sort the rows by fold change.
res_DAC = res_DAC[order(res_DAC$log2FoldChange, decreasing = TRUE), ]
print(summary(res_DAC))
## 
## out of 17757 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up)       : 3423, 19%
## LFC < 0 (down)     : 3595, 20%
## outliers [1]       : 0, 0%
## low counts [2]     : 0, 0%
## (mean count < 3)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
## 
## NULL

Log fold change shrinkage

For visualization and ranking according to DESeq authors.

lfcShrink_GSK = lfcShrink(ddsTxi, coef = "condition_GSK_vs_DMSO", type = "apeglm")
lfcShrink_DAC = lfcShrink(ddsTxi, coef = "condition_DAC_vs_DMSO", type = "apeglm")

Quality plots

We look at various plots for data quality.

MA-plot

Most of the points (genes) are expected to be around the horizontal 0 line. True.

DESeq2::plotMA(lfcShrink_GSK, ylim = c(-3, 3))

DESeq2::plotMA(lfcShrink_DAC, ylim = c(-3, 3))

P-value distribution

Show expected results.

par(mfrow = c(1, 2))

ggplot(res_GSK, aes(x = pvalue)) + geom_histogram(bins = 100) +
    ggtitle("Distribution of raw p values") +
    theme_classic()

ggplot(res_DAC, aes(x = padj)) + geom_histogram(bins = 100) +
        ggtitle("Distribution of raw p values") + 
        theme_classic()

Raw and normalized counts

Relative log expression plots. Normalization worked well.

EDASeq::plotRLE(counts(ddsTxi), 
        outline = FALSE, 
        col = colData(ddsTxi)$condition,
        ylim = c(-3, 3),
        main = "Raw Counts")

EDASeq::plotRLE(counts(ddsTxi, normalized = TRUE), 
        outline = FALSE, 
        col = colData(ddsTxi)$condition,
        ylim = c(-3, 3),
        main = "Normalized Counts")

Heatmaps of transformed count matrix

Replicates have variations with a pattern that seems to suggest batch effect, but overall they were clustered together.

Normalized counts.

select = order(rowMeans(counts(ddsTxi, normalized = TRUE)), decreasing = TRUE)[1:100]
col_df <- data.frame(colData(ddsTxi))["condition"]
pheatmap(assay(normTransform(ddsTxi))[select, ], scale = "row", 
         cluster_rows = FALSE, show_rownames = FALSE,
         annotation_col = col_df)

Counts with regularized log transformation(rlog).

rlogT = rlog(ddsTxi)
pheatmap(assay(rlogT)[select, ], scale = "row", 
         cluster_rows = FALSE, show_rownames = FALSE,
         annotation_col = col_df)

Counts with variance_stabilizing transformation(vst).

vsd = vst(ddsTxi)
pheatmap(assay(vsd)[select, ], scale = "row", 
         cluster_rows = FALSE, show_rownames = FALSE,
         annotation_col = col_df)

Use limma removeBatchBatchEffect to see if batch effect indeed exists.

Hard to tell here, but clear in PCA plot (below).

vsd2 = vsd
mat = assay(vsd2)
mmatrix = model.matrix(~ condition, colData(vsd2))
mat = removeBatchEffect(mat, batch = vsd2$batch, design = mmatrix)
assay(vsd2) = mat
pheatmap(assay(vsd2)[select, ], scale = "row", 
         cluster_rows = FALSE, show_rownames = FALSE,
         annotation_col = col_df)

Count PCA

Groups are well separated and replicates are close to each other. Also the variance was well captured by the first two components.

With rlog-transformation.

plotPCA(rlogT, ntop = 500, intgroup = "condition") + 
    theme_classic()

With vst.

plotPCA(vst(ddsTxi), ntop = 500, intgroup = "condition") + 
    theme_classic()

Use vst-transformed and limma-processed data fro PCA. Now the replicates are much closer for control and GSK compared to unprocessed data above.

plotPCA(vsd2, ntop = 500, intgroup = "condition") + 
    theme_classic()

sample-to-sample distance

Replicates are closer.

sampleDists = dist(t(assay(rlogT)))
sampleDistMatrix = as.matrix(sampleDists)
rownames(sampleDistMatrix) = paste(rlogT$condition, rlogT$batch, sep = "-")
colnames(sampleDistMatrix) = NULL
colors <- colorRampPalette(rev(brewer.pal(9, "Blues")))(255)
pheatmap(sampleDistMatrix,
         clustering_distance_rows = sampleDists,
         clustering_distance_cols = sampleDists, 
         cols = colors)

Dispersion plot

Show typical result. Overall good data quality.

plotDispEsts(ddsTxi)

Annotation

Add gene symbol and name to DE results. Add EntrezID for downstream analysis.

annot_result = function(res) {
    res$Symbol = mapIds(org.Hs.eg.db,
                        keys = row.names(res),
                        column = "SYMBOL",
                        keytype = "ENSEMBL",
                        multiVals = "first")
    
    res$GeneName = mapIds(org.Hs.eg.db,
                        keys = row.names(res),
                        column = "GENENAME",
                        keytype = "ENSEMBL",
                        multiVals = "first")

    res$EntrezID = mapIds(org.Hs.eg.db,
                        keys = row.names(res),
                        column = "ENTREZID",
                        keytype = "ENSEMBL",
                        multiVals = "first")
    res = res[, c(7, 8, 9, 1:6)]
    return(res)
}
res_GSK = annot_result(res_GSK)
res_DAC = annot_result(res_DAC)
res_GSK = res_GSK[!is.na(res_GSK$Symbol), ]
res_DAC = res_DAC[!is.na(res_DAC$Symbol), ]

`

Top 10 significantly (padj < 0.05) up-regulated genes.

print(res_GSK[res_GSK$padj < 0.05, ][1:10, c("Symbol", "GeneName", "log2FoldChange", "pvalue", "padj")])
## log2 fold change (MLE): condition GSK vs DMSO 
## Wald test p-value: condition GSK vs DMSO 
## DataFrame with 10 rows and 5 columns
##                      Symbol               GeneName log2FoldChange      pvalue
##                 <character>            <character>      <numeric>   <numeric>
## ENSG00000124721       DNAH8 dynein axonemal heav..        12.4850 1.20972e-17
## ENSG00000203989     RHOXF2B Rhox homeobox family..        11.9507 1.38415e-18
## ENSG00000184735       DDX53   DEAD-box helicase 53        11.7740 1.06183e-15
## ENSG00000075886      TUBA3D       tubulin alpha 3d        11.3790 1.10250e-14
## ENSG00000198028      ZNF560 zinc finger protein ..        11.2614 2.21067e-14
## ENSG00000180113       TDRD6 tudor domain contain..        11.1221 5.67694e-17
## ENSG00000229807        XIST X inactive specific ..        11.0933 1.50093e-14
## ENSG00000166049       PASD1 PAS domain containin..        11.0438 8.10424e-14
## ENSG00000242389      RBMY1E RNA binding motif pr..        11.0363 3.23167e-13
## ENSG00000198681      MAGEA1  MAGE family member A1        10.9451 1.64583e-13
##                        padj
##                   <numeric>
## ENSG00000124721 3.59816e-16
## ENSG00000203989 4.46879e-17
## ENSG00000184735 2.62604e-14
## ENSG00000075886 2.47811e-13
## ENSG00000198028 4.83434e-13
## ENSG00000180113 1.59250e-15
## ENSG00000229807 3.31906e-13
## ENSG00000166049 1.65220e-12
## ENSG00000242389 6.17040e-12
## ENSG00000198681 3.22928e-12
print(res_DAC[res_DAC$padj < 0.05, ][1:10, c("Symbol", "GeneName", "log2FoldChange", "pvalue", "padj")])
## log2 fold change (MLE): condition DAC vs DMSO 
## Wald test p-value: condition DAC vs DMSO 
## DataFrame with 10 rows and 5 columns
##                      Symbol               GeneName log2FoldChange      pvalue
##                 <character>            <character>      <numeric>   <numeric>
## ENSG00000124721       DNAH8 dynein axonemal heav..        12.1772 7.33137e-17
## ENSG00000075886      TUBA3D       tubulin alpha 3d        11.2410 2.26692e-14
## ENSG00000203989     RHOXF2B Rhox homeobox family..        11.2166 1.48171e-16
## ENSG00000244398   RPL36AP37 ribosomal protein L3..        11.2080 3.75719e-03
## ENSG00000184735       DDX53   DEAD-box helicase 53        11.1359 3.33346e-14
## ENSG00000136931       NR5A1 nuclear receptor sub..        11.1126 6.35778e-14
## ENSG00000198028      ZNF560 zinc finger protein ..        10.8640 1.71914e-13
## ENSG00000166049       PASD1 PAS domain containin..        10.5842 8.16395e-13
## ENSG00000122728       TAF1L TATA-box binding pro..        10.3839 2.65262e-12
## ENSG00000268606      MAGEA2  MAGE family member A2        10.1495 9.08069e-12
##                        padj
##                   <numeric>
## ENSG00000124721 1.68850e-15
## ENSG00000075886 4.14559e-13
## ENSG00000203989 3.34316e-15
## ENSG00000244398 1.18166e-02
## ENSG00000184735 5.97901e-13
## ENSG00000136931 1.09927e-12
## ENSG00000198028 2.85030e-12
## ENSG00000166049 1.25513e-11
## ENSG00000122728 3.85770e-11
## ENSG00000268606 1.24514e-10
print("Number of genes common to both drugs:")
## [1] "Number of genes common to both drugs:"
print(sum(res_GSK[1:10, ]$Symbol %in% res_DAC[1:10, ]$Symbol))
## [1] 6

Top 10 significantly (padj < 0.05) down-regulated genes.

print(tail(res_GSK[res_GSK$padj < 0.05, ], 10)[, c("Symbol", "GeneName", "log2FoldChange", "pvalue", "padj")])
## log2 fold change (MLE): condition GSK vs DMSO 
## Wald test p-value: condition GSK vs DMSO 
## DataFrame with 10 rows and 5 columns
##                      Symbol               GeneName log2FoldChange      pvalue
##                 <character>            <character>      <numeric>   <numeric>
## ENSG00000197253       TPSB2        tryptase beta 2       -5.23540 1.03105e-02
## ENSG00000277734        TRAC T cell receptor alph..       -5.96373 4.10806e-04
## ENSG00000114374       USP9Y ubiquitin specific p..       -6.08607 2.93758e-03
## ENSG00000165702       GFI1B growth factor indepe..       -6.31453 1.99860e-03
## ENSG00000163815      CLEC3B C-type lectin domain..       -6.38091 6.29531e-04
## ENSG00000187627       RGPD1 RANBP2 like and GRIP..       -6.77704 3.94394e-04
## ENSG00000251002     TRD-AS1    TRD antisense RNA 1       -6.93087 1.40840e-03
## ENSG00000067842      ATP2B3 ATPase plasma membra..       -7.20292 1.59548e-04
## ENSG00000187730       GABRD gamma-aminobutyric a..       -7.30940 1.26324e-04
## ENSG00000270757  HSPE1-MOB4 HSPE1-MOB4 readthrough       -9.32545 2.81333e-08
##                        padj
##                   <numeric>
## ENSG00000197253 3.05394e-02
## ENSG00000277734 1.79099e-03
## ENSG00000114374 1.03170e-02
## ENSG00000165702 7.35375e-03
## ENSG00000163815 2.63026e-03
## ENSG00000187627 1.72664e-03
## ENSG00000251002 5.39220e-03
## ENSG00000067842 7.71958e-04
## ENSG00000187730 6.27273e-04
## ENSG00000270757 2.71315e-07
print(tail(res_DAC[res_DAC$padj < 0.05, ], 10)[, c("Symbol", "GeneName", "log2FoldChange", "pvalue", "padj")])
## log2 fold change (MLE): condition DAC vs DMSO 
## Wald test p-value: condition DAC vs DMSO 
## DataFrame with 10 rows and 5 columns
##                       Symbol               GeneName log2FoldChange      pvalue
##                  <character>            <character>      <numeric>   <numeric>
## ENSG00000235823     OLMALINC oligodendrocyte matu..       -5.25509 1.62390e-02
## ENSG00000278932 LOC105379428 uncharacterized LOC1..       -6.03655 1.24608e-02
## ENSG00000144821        MYH15  myosin heavy chain 15       -6.53909 3.42882e-03
## ENSG00000231007      CDC20P1 cell division cycle ..       -6.73600 1.20386e-03
## ENSG00000214967       NPIPA7 nuclear pore complex..       -7.03997 7.59299e-05
## ENSG00000228705    LINC00659 long intergenic non-..       -7.06344 8.30789e-03
## ENSG00000247095     MIR210HG       MIR210 host gene       -7.15559 7.95339e-05
## ENSG00000260442   ATP2A1-AS1 ATP2A1 antisense RNA 1       -7.37941 6.12512e-03
## ENSG00000167912       TOX-DT TOX divergent transc..       -7.45650 2.25770e-03
## ENSG00000174010       KLHL15 kelch like family me..      -13.18645 5.85323e-03
##                        padj
##                   <numeric>
## ENSG00000235823 0.042330630
## ENSG00000278932 0.033530282
## ENSG00000144821 0.010936683
## ENSG00000231007 0.004315961
## ENSG00000214967 0.000358873
## ENSG00000228705 0.023582795
## ENSG00000247095 0.000374115
## ENSG00000260442 0.018126665
## ENSG00000167912 0.007564161
## ENSG00000174010 0.017412604
print("Number of genes common to both drugs:")
## [1] "Number of genes common to both drugs:"
print(sum(tail(res_GSK[res_GSK$padj < 0.05, ], 10)$Symbol %in% 
          tail(res_DAC[res_DAC$padj < 0.05,], 10)$Symbol))
## [1] 0

Save the files.

write.table(res_GSK, "MV411_day4_GSK032_400nM_salmon_lfc.tsv",
            sep = "\t", quote = FALSE)
write.table(res_DAC, "MV411_day4_DAC_400nM_salmon_lfc.tsv",
            sep = "\t", quote = FALSE)

Interactive Volcano plot

Hover the mouse over a point to see the gene symbol, fold change and -log10(padj).

plotVolcano = function(res_DE, condition) {
                      ggplot(res_DE,
                             aes(x = log2FoldChange,
                                 y = -log10(padj),
                                 label = Symbol,
                                 color = ifelse(res_DE$padj < 0.05 & abs(res_DE$log2FoldChange) > 1,
                                                    "yes", "not"))) +
                      geom_point() + 
                      labs(title = paste0(condition, " vs control"),
                          xlab = "log2 fold change", 
                          ylab = "-log10 adjusted p-value",
                          color = "> 2-fold & padj < 0.05") +
                      scale_color_manual(values = c("#56B4E9", "#D55E00")) +
                      theme(plot.title = element_text(size = 12, hjust = 0.5),
                            axis.title = element_text(size = 12),
                            legend.position = "right")
}                        
plotGSK = plotVolcano(res_GSK, "GSK")
ggplotly(plotGSK, tooltip = c("log2FoldChange", "-log10(padj)", "label"))
plotDAC = plotVolcano(res_DAC, "DAC")
ggplotly(plotDAC, tooltip = c("log2FoldChange", "-log10(padj)", "label"))

Venn diagrams for up- and down-regulated genes

There were more up-regulated genes than down-regulated genes.

par(mfrow = c(1, 2))

# Select genes with at least 2-fold changes and sort them decreasingly.
DE_gene_GSK = res_GSK[abs(res_GSK$log2FoldChange) >= 1 & res_GSK$padj < 0.05, ]
DE_gene_GSK = DE_gene_GSK[, c("Symbol", "EntrezID", "log2FoldChange")]
DE_gene_GSK = DE_gene_GSK[!is.na(DE_gene_GSK$Symbol), ]
DE_gene_GSK = DE_gene_GSK[order(-DE_gene_GSK$log2FoldChange), ]
print(nrow(DE_gene_GSK))  # 3435 genes
## [1] 2086
DE_gene_DAC = res_DAC[abs(res_DAC$log2FoldChange) >= 1 & res_DAC$padj < 0.05, ]
DE_gene_DAC = DE_gene_DAC[, c("Symbol",  "EntrezID", "log2FoldChange")]
DE_gene_DAC = DE_gene_DAC[!is.na(DE_gene_DAC$Symbol), ]
DE_gene_DAC = DE_gene_DAC[order(-DE_gene_DAC$log2FoldChange), ]
print(nrow(DE_gene_DAC))  # 4015 genes
## [1] 2449
up_list = list(GSK = as.vector(DE_gene_GSK[DE_gene_GSK$log2FoldChange >= 1, ]$Symbol),
               DAC = as.vector(DE_gene_DAC[DE_gene_DAC$log2FoldChange >= 1, ]$Symbol))
print(sapply(names(up_list), function(x) length(up_list[[x]])))
##  GSK  DAC 
## 1559 1846
down_list = list(GSK = as.vector(DE_gene_GSK[DE_gene_GSK$log2FoldChange < 1, ]$Symbol),
                 DAC = as.vector(DE_gene_DAC[DE_gene_DAC$log2FoldChange < 1, ]$Symbol))
print(sapply(names(up_list), function(x) length(down_list[[x]])))
## GSK DAC 
## 527 603
ggvenn(up_list, 
       fill_color = c("white", "white"),
       stroke_size = 0.5,
       set_name_size = 6) + 
       ggtitle("Up-regulated genes") +
       theme(plot.title = element_text(size = 16, hjust = 0.5))

ggvenn(down_list, 
       fill_color = c("white", "white"),
       stroke_size = 0.5,
       set_name_size = 6) + 
       ggtitle("Down-regulated genes") +
       theme(plot.title = element_text(size = 16, hjust = 0.5))

Gene enrichment analysis

GO ontology

Use clusterProfiler package. This package runs fgesa algorithm by default.

The following results appeared to indicate changes involving molecular interactions and cellular structures. The similarity in “Biological process” invoked by the two drugs was prominent.

GSK

goPath = function(ont, gene) {
                enrichGO(gene = gene,
                        OrgDb = org.Hs.eg.db,
                        keyType = "SYMBOL",
                        ont = ont,
                        pAdjustMethod = "BH",
                        pvalueCutoff = 0.01,
                        qvalueCutoff = 0.05)
}

# Molecular functions
goGSK_MF = goPath("MF", DE_gene_GSK$Symbol)
print(head(goGSK_MF$Description, 10))
## [1] "structural constituent of chromatin"  
## [2] "protein heterodimerization activity"  
## [3] "MHC class II protein complex binding" 
## [4] "single-stranded DNA helicase activity"
## [5] "cargo receptor activity"              
## [6] "MHC protein complex binding"          
## [7] "DNA replication origin binding"       
## [8] "cytoskeletal motor activity"
# Biological process
goGSK_BP = goPath("BP", DE_gene_GSK$Symbol)
print(head(goGSK_BP$Description, 10))
##  [1] "nucleosome assembly"          "nucleosome organization"     
##  [3] "protein-DNA complex assembly" "meiotic cell cycle"          
##  [5] "nuclear division"             "meiotic cell cycle process"  
##  [7] "meiosis I cell cycle process" "meiotic nuclear division"    
##  [9] "organelle fission"            "meiosis I"

DAC

# Molecular functions
goDAC_MF = goPath("MF", DE_gene_DAC$Symbol)
print(head(goDAC_MF$Description, 10))
## [1] "structural constituent of chromatin" 
## [2] "protein heterodimerization activity" 
## [3] "MHC class II protein complex binding"
## [4] "MHC protein complex binding"         
## [5] "peptide antigen binding"             
## [6] "vitamin B6 binding"                  
## [7] "cytoskeletal motor activity"
# Biological process
goDAC_BP = goPath("BP", DE_gene_DAC$Symbol)
print(head(goDAC_BP$Description, 10))
##  [1] "nucleosome assembly"                                
##  [2] "nucleosome organization"                            
##  [3] "nuclear division"                                   
##  [4] "protein-DNA complex assembly"                       
##  [5] "meiotic nuclear division"                           
##  [6] "meiotic cell cycle"                                 
##  [7] "organelle fission"                                  
##  [8] "meiotic cell cycle process"                         
##  [9] "protein localization to CENP-A containing chromatin"
## [10] "chromosome segregation"

plot the “Biological Process” network.

goplot(goGSK_BP, showCategory = 5)

goplot(goDAC_BP, showCategory = 5)

MSigDb gene set enrichment analysis

gesaPath = function(gene_set, DE_genes, nPermSimple = 10000) {
                pathways = gmtPathways(gene_set)
                ranks = DE_genes$log2FoldChange
                names(ranks) = DE_genes$EntrezID
                gesa_result = fgsea(pathways = pathways, 
                                    stats = ranks,
                                    minSize = 15,
                                    maxSize = 500,
                                    nPermSimple = nPermSimple)
                return(gesa_result)
}

GSK

Curated (C2) pathways.

Result includes Many cancer- and methylation-related pathways.

c2_gene_set = "gmt-file/c2.all.v2024.1.Hs.entrez.gmt"  # Path to gmt file
c2_GSK = gesaPath(c2_gene_set, DE_gene_GSK)

c2_top10_up = c2_GSK[ES > 0][head(order(pval), 10), pathway]
cat("Top ten up-regulated pathways:\n")
## Top ten up-regulated pathways:
print(c2_top10_up)
##  [1] "ACEVEDO_METHYLATED_IN_LIVER_CANCER_DN"   
##  [2] "MIKKELSEN_MEF_HCP_WITH_H3_UNMETHYLATED"  
##  [3] "YEGNASUBRAMANIAN_PROSTATE_CANCER"        
##  [4] "JAEGER_METASTASIS_UP"                    
##  [5] "MCCABE_BOUND_BY_HOXC6"                   
##  [6] "ZHONG_RESPONSE_TO_AZACITIDINE_AND_TSA_UP"
##  [7] "MIKKELSEN_IPS_WITH_HCP_H3K27ME3"         
##  [8] "KIM_RESPONSE_TO_TSA_AND_DECITABINE_UP"   
##  [9] "RICKMAN_HEAD_AND_NECK_CANCER_B"          
## [10] "ACEVEDO_LIVER_CANCER_WITH_H3K27ME3_UP"
c2_top10_down = c2_GSK[ES < 0][head(order(pval), 10), pathway]
cat("Top ten down-regulated pathways:\n")
## Top ten down-regulated pathways:
print(c2_top10_down)
##  [1] "WINNEPENNINCKX_MELANOMA_METASTASIS_UP"    
##  [2] "MORI_IMMATURE_B_LYMPHOCYTE_DN"            
##  [3] "CROONQUIST_NRAS_SIGNALING_DN"             
##  [4] "KAUFFMANN_DNA_REPAIR_GENES"               
##  [5] "ZHOU_CELL_CYCLE_GENES_IN_IR_RESPONSE_24HR"
##  [6] "ZHOU_CELL_CYCLE_GENES_IN_IR_RESPONSE_6HR" 
##  [7] "BENPORATH_PROLIFERATION"                  
##  [8] "PUJANA_XPRSS_INT_NETWORK"                 
##  [9] "REACTOME_HCMV_LATE_EVENTS"                
## [10] "REACTOME_HDACS_DEACETYLATE_HISTONES"
rank_GSK = DE_gene_GSK$log2FoldChange
names(rank_GSK) = DE_gene_GSK$EntrezID
c2_pathways = gmtPathways(c2_gene_set)

plotEnrichment(c2_pathways[["ZHONG_RESPONSE_TO_AZACITIDINE_AND_TSA_UP"]], rank_GSK) +
            labs(title = "RESPONSE_TO_AZACITIDINE_AND_TSA_UP genes")

Hallmark pathways.

h_gene_set = "gmt-file/h.all.v2024.1.Hs.entrez.gmt"
h_GSK = gesaPath(h_gene_set, DE_gene_GSK)

h_top10_up = h_GSK[ES > 0][head(order(pval), 10), pathway]
cat("Top ten up-regulated pathways:\n")
## Top ten up-regulated pathways:
print(h_top10_up)
##  [1] "HALLMARK_SPERMATOGENESIS"           "HALLMARK_KRAS_SIGNALING_DN"        
##  [3] "HALLMARK_MYOGENESIS"                "HALLMARK_ADIPOGENESIS"             
##  [5] "HALLMARK_INTERFERON_ALPHA_RESPONSE" "HALLMARK_HYPOXIA"                  
##  [7] "HALLMARK_FATTY_ACID_METABOLISM"     "HALLMARK_INFLAMMATORY_RESPONSE"    
##  [9] "HALLMARK_XENOBIOTIC_METABOLISM"     "HALLMARK_COAGULATION"
h_top10_down = h_GSK[ES < 0][head(order(pval), 10), pathway]
cat("Top ten down-regulated pathways:\n")
## Top ten down-regulated pathways:
print(h_top10_down)
##  [1] "HALLMARK_MYC_TARGETS_V1"          "HALLMARK_MTORC1_SIGNALING"       
##  [3] "HALLMARK_APOPTOSIS"               "HALLMARK_DNA_REPAIR"             
##  [5] "HALLMARK_P53_PATHWAY"             "HALLMARK_CHOLESTEROL_HOMEOSTASIS"
##  [7] "HALLMARK_IL2_STAT5_SIGNALING"     "HALLMARK_TNFA_SIGNALING_VIA_NFKB"
##  [9] "HALLMARK_APICAL_JUNCTION"         "HALLMARK_UV_RESPONSE_UP"

Reactome pathways.

reactome_set = "gmt-file/c2.cp.reactome.v2024.1.Hs.entrez.gmt"
react_GSK = gesaPath(reactome_set, DE_gene_GSK)

react_top10_up = react_GSK[ES > 0][head(order(pval), 10), pathway]
cat("Top ten up-regulated pathways:\n")
## Top ten up-regulated pathways:
print(react_top10_up)
##  [1] "REACTOME_NEUROTRANSMITTER_RECEPTORS_AND_POSTSYNAPTIC_SIGNAL_TRANSMISSION"                                                        
##  [2] "REACTOME_ORGANELLE_BIOGENESIS_AND_MAINTENANCE"                                                                                   
##  [3] "REACTOME_NEURONAL_SYSTEM"                                                                                                        
##  [4] "REACTOME_O_LINKED_GLYCOSYLATION"                                                                                                 
##  [5] "REACTOME_ASPARAGINE_N_LINKED_GLYCOSYLATION"                                                                                      
##  [6] "REACTOME_TRANSMISSION_ACROSS_CHEMICAL_SYNAPSES"                                                                                  
##  [7] "REACTOME_ANTIVIRAL_MECHANISM_BY_IFN_STIMULATED_GENES"                                                                            
##  [8] "REACTOME_POTASSIUM_CHANNELS"                                                                                                     
##  [9] "REACTOME_REGULATION_OF_INSULIN_LIKE_GROWTH_FACTOR_IGF_TRANSPORT_AND_UPTAKE_BY_INSULIN_LIKE_GROWTH_FACTOR_BINDING_PROTEINS_IGFBPS"
## [10] "REACTOME_MHC_CLASS_II_ANTIGEN_PRESENTATION"
react_top10_down = react_GSK[ES < 0][head(order(pval), 10), pathway]
cat("Top ten down-regulated pathways:\n")
## Top ten down-regulated pathways:
print(react_top10_down)
##  [1] "REACTOME_S_PHASE"                                                                   
##  [2] "REACTOME_HCMV_LATE_EVENTS"                                                          
##  [3] "REACTOME_HDACS_DEACETYLATE_HISTONES"                                                
##  [4] "REACTOME_HATS_ACETYLATE_HISTONES"                                                   
##  [5] "REACTOME_DNA_DOUBLE_STRAND_BREAK_REPAIR"                                            
##  [6] "REACTOME_HOMOLOGY_DIRECTED_REPAIR"                                                  
##  [7] "REACTOME_DISEASES_OF_PROGRAMMED_CELL_DEATH"                                         
##  [8] "REACTOME_SYNTHESIS_OF_DNA"                                                          
##  [9] "REACTOME_RUNX1_REGULATES_TRANSCRIPTION_OF_GENES_INVOLVED_IN_DIFFERENTIATION_OF_HSCS"
## [10] "REACTOME_SENESCENCE_ASSOCIATED_SECRETORY_PHENOTYPE_SASP"
react_pathways = gmtPathways(reactome_set)

plotEnrichment(react_pathways[["REACTOME_HDACS_DEACETYLATE_HISTONES"]], rank_GSK) +
            labs(title = "REACTOME_HDACS_DEACETYLATE_HISTONES genes")

DAC

Curated (C2) pathways.

Similar response to GSK.

c2_gene_set = "gmt-file/c2.all.v2024.1.Hs.entrez.gmt"  # Path to gmt file
c2_DAC = gesaPath(c2_gene_set, DE_gene_DAC)

c2_top10_up = c2_DAC[ES > 0][head(order(pval), 10), pathway]
cat("Top ten up-regulated pathways:\n")
## Top ten up-regulated pathways:
print(c2_top10_up)
##  [1] "ACEVEDO_METHYLATED_IN_LIVER_CANCER_DN"   
##  [2] "YEGNASUBRAMANIAN_PROSTATE_CANCER"        
##  [3] "MIKKELSEN_MEF_HCP_WITH_H3_UNMETHYLATED"  
##  [4] "MADAN_DPPA4_TARGETS"                     
##  [5] "MUELLER_METHYLATED_IN_GLIOBLASTOMA"      
##  [6] "PURBEY_TARGETS_OF_CTBP1_NOT_SATB1_DN"    
##  [7] "RICKMAN_HEAD_AND_NECK_CANCER_B"          
##  [8] "ZHONG_RESPONSE_TO_AZACITIDINE_AND_TSA_UP"
##  [9] "MIKKELSEN_IPS_WITH_HCP_H3K27ME3"         
## [10] "NOURUZI_NEPC_ASCL1_TARGETS"
c2_top10_down = c2_DAC[ES < 0][head(order(pval), 10), pathway]
cat("Top ten down-regulated pathways:\n")
## Top ten down-regulated pathways:
print(c2_top10_down)
##  [1] "CROONQUIST_IL6_DEPRIVATION_DN"            
##  [2] "WINNEPENNINCKX_MELANOMA_METASTASIS_UP"    
##  [3] "ZHOU_CELL_CYCLE_GENES_IN_IR_RESPONSE_24HR"
##  [4] "PUJANA_XPRSS_INT_NETWORK"                 
##  [5] "BENPORATH_PROLIFERATION"                  
##  [6] "WHITEFORD_PEDIATRIC_CANCER_MARKERS"       
##  [7] "REACTOME_DNA_REPLICATION_PRE_INITIATION"  
##  [8] "MORI_IMMATURE_B_LYMPHOCYTE_DN"            
##  [9] "REACTOME_HDACS_DEACETYLATE_HISTONES"      
## [10] "CROONQUIST_NRAS_SIGNALING_DN"
rank_DAC = DE_gene_DAC$log2FoldChange
names(rank_DAC) = DE_gene_DAC$EntrezID

plotEnrichment(c2_pathways[["ZHONG_RESPONSE_TO_AZACITIDINE_AND_TSA_UP"]], rank_DAC) +
            labs(title = "RESPONSE_TO_AZACITIDINE_AND_TSA_UP genes")

Hallmark pathways.

h_gene_set = "gmt-file/h.all.v2024.1.Hs.entrez.gmt"
h_DAC = gesaPath(h_gene_set, DE_gene_DAC)

h_top10_up = h_DAC[ES > 0][head(order(pval), 10), pathway]
cat("Top ten up-regulated pathways:\n")
## Top ten up-regulated pathways:
print(h_top10_up)
##  [1] "HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION"
##  [2] "HALLMARK_HEME_METABOLISM"                  
##  [3] "HALLMARK_KRAS_SIGNALING_DN"                
##  [4] "HALLMARK_ALLOGRAFT_REJECTION"              
##  [5] "HALLMARK_COMPLEMENT"                       
##  [6] "HALLMARK_KRAS_SIGNALING_UP"                
##  [7] "HALLMARK_MYOGENESIS"                       
##  [8] "HALLMARK_COAGULATION"                      
##  [9] "HALLMARK_HYPOXIA"                          
## [10] "HALLMARK_INFLAMMATORY_RESPONSE"
h_top10_down = h_DAC[ES < 0][head(order(pval), 10), pathway]
cat("Top ten down-regulated pathways:\n")
## Top ten down-regulated pathways:
print(h_top10_down)
##  [1] "HALLMARK_MYC_TARGETS_V1"            "HALLMARK_MTORC1_SIGNALING"         
##  [3] "HALLMARK_CHOLESTEROL_HOMEOSTASIS"   "HALLMARK_MYC_TARGETS_V2"           
##  [5] "HALLMARK_UNFOLDED_PROTEIN_RESPONSE" "HALLMARK_DNA_REPAIR"               
##  [7] "HALLMARK_TNFA_SIGNALING_VIA_NFKB"   "HALLMARK_ANDROGEN_RESPONSE"        
##  [9] "HALLMARK_MITOTIC_SPINDLE"           "HALLMARK_FATTY_ACID_METABOLISM"

Reactome pathways.

reactome_set = "gmt-file/c2.cp.reactome.v2024.1.Hs.entrez.gmt"
react_DAC = gesaPath(reactome_set, DE_gene_DAC)

react_top10_up = react_DAC[ES > 0][head(order(pval), 10), pathway]
cat("Top ten up-regulated pathways:\n")
## Top ten up-regulated pathways:
print(react_top10_up)
##  [1] "REACTOME_NEUROTRANSMITTER_RECEPTORS_AND_POSTSYNAPTIC_SIGNAL_TRANSMISSION"
##  [2] "REACTOME_INTRA_GOLGI_AND_RETROGRADE_GOLGI_TO_ER_TRAFFIC"                 
##  [3] "REACTOME_CILIUM_ASSEMBLY"                                                
##  [4] "REACTOME_PHASE_I_FUNCTIONALIZATION_OF_COMPOUNDS"                         
##  [5] "REACTOME_ACTIVATION_OF_NMDA_RECEPTORS_AND_POSTSYNAPTIC_EVENTS"           
##  [6] "REACTOME_TRANSMISSION_ACROSS_CHEMICAL_SYNAPSES"                          
##  [7] "REACTOME_DISEASES_ASSOCIATED_WITH_O_GLYCOSYLATION_OF_PROTEINS"           
##  [8] "REACTOME_POTASSIUM_CHANNELS"                                             
##  [9] "REACTOME_OPIOID_SIGNALLING"                                              
## [10] "REACTOME_ORGANELLE_BIOGENESIS_AND_MAINTENANCE"
react_top10_down = react_DAC[ES < 0][head(order(pval), 10), pathway]
cat("Top ten down-regulated pathways:\n")
## Top ten down-regulated pathways:
print(react_top10_down)
##  [1] "REACTOME_DNA_REPLICATION_PRE_INITIATION"                                  
##  [2] "REACTOME_HDACS_DEACETYLATE_HISTONES"                                      
##  [3] "REACTOME_G2_M_CHECKPOINTS"                                                
##  [4] "REACTOME_HCMV_LATE_EVENTS"                                                
##  [5] "REACTOME_HATS_ACETYLATE_HISTONES"                                         
##  [6] "REACTOME_DNA_DOUBLE_STRAND_BREAK_REPAIR"                                  
##  [7] "REACTOME_HOMOLOGY_DIRECTED_REPAIR"                                        
##  [8] "REACTOME_S_PHASE"                                                         
##  [9] "REACTOME_DEPOSITION_OF_NEW_CENPA_CONTAINING_NUCLEOSOMES_AT_THE_CENTROMERE"
## [10] "REACTOME_TRANSCRIPTIONAL_REGULATION_OF_GRANULOPOIESIS"
plotEnrichment(react_pathways[["REACTOME_HDACS_DEACETYLATE_HISTONES"]], rank_DAC) +
            labs(title = "REACTOME_HDACS_DEACETYLATE_HISTONES genes")

Interactive figure for exploration

Interactive figures with a dynamic data table.

See Glimma vignette for instruction.

glimmaMA(ddsTxi, groups = ddsTxi$condition)
## 0 of 17757 genes were filtered out in DESeq2 tests

Conclusion

Together with accompanying shell scripts, we demonstrated the process of obtaining Salmon quantification count of selected bulk RNA-seq reads, and using the count matrix for differential expression and pathway analysis. The count matrix we processed was about 2/3 of that from NCBI raw count, but overall the results were consistent.

Resources

Love MI, et al. Analyzing RNA-seq data with DESeq2. https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html

Akalin, A. Computational Genomics with R, Chapter 8, RNA-seq analysis. https://compgenomr.github.io/book/rnaseqanalysis.html

Gu Y, et al. Biomedical Knowledge Mining using GOSemSim and clusterProfiler. Part II: Enrichment analysis. https://yulab-smu.top/biomedical-knowledge-mining-book/enrichment-overview.html

Session info

sessionInfo()
## R version 4.3.2 (2023-10-31 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 11 x64 (build 26100)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8 
## [2] LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] grid      stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] plotly_4.10.4               ggvenn_0.1.10              
##  [3] Glimma_2.12.0               RColorBrewer_1.1-3         
##  [5] ggplot2_3.5.1               stringr_1.5.1              
##  [7] readr_2.1.5                 dplyr_1.1.4                
##  [9] pheatmap_1.0.12             hexbin_1.28.5              
## [11] vsn_3.70.0                  ReactomePA_1.46.0          
## [13] msigdbr_7.5.1               fgsea_1.28.0               
## [15] clusterProfiler_4.10.1      limma_3.58.1               
## [17] EDASeq_2.36.0               ShortRead_1.60.0           
## [19] GenomicAlignments_1.38.2    Rsamtools_2.18.0           
## [21] Biostrings_2.70.3           XVector_0.42.0             
## [23] BiocParallel_1.36.0         apeglm_1.24.0              
## [25] tximport_1.30.0             DESeq2_1.42.1              
## [27] SummarizedExperiment_1.32.0 MatrixGenerics_1.14.0      
## [29] matrixStats_1.4.1           EnsDb.Hsapiens.v86_2.99.0  
## [31] ensembldb_2.26.0            AnnotationFilter_1.26.0    
## [33] GenomicFeatures_1.54.4      GenomicRanges_1.54.1       
## [35] GenomeInfoDb_1.38.8         org.Hs.eg.db_3.18.0        
## [37] AnnotationDbi_1.64.1        IRanges_2.36.0             
## [39] S4Vectors_0.40.2            Biobase_2.62.0             
## [41] BiocGenerics_0.48.1        
## 
## loaded via a namespace (and not attached):
##   [1] splines_4.3.2           BiocIO_1.12.0           bitops_1.0-9           
##   [4] ggplotify_0.1.2         filelock_1.0.3          tibble_3.2.1           
##   [7] R.oo_1.27.0             polyclip_1.10-7         preprocessCore_1.64.0  
##  [10] graph_1.80.0            XML_3.99-0.17           lifecycle_1.0.4        
##  [13] edgeR_4.0.16            vroom_1.6.5             lattice_0.22-6         
##  [16] MASS_7.3-60             crosstalk_1.2.1         magrittr_2.0.3         
##  [19] sass_0.4.9              rmarkdown_2.29          jquerylib_0.1.4        
##  [22] yaml_2.3.10             cowplot_1.1.3           DBI_1.2.3              
##  [25] abind_1.4-8             zlibbioc_1.48.2         purrr_1.0.2            
##  [28] R.utils_2.12.3          ggraph_2.2.1            RCurl_1.98-1.16        
##  [31] yulab.utils_0.1.8       tweenr_2.0.3            rappdirs_0.3.3         
##  [34] GenomeInfoDbData_1.2.11 enrichplot_1.22.0       ggrepel_0.9.6          
##  [37] tidytree_0.4.6          reactome.db_1.86.2      codetools_0.2-20       
##  [40] DelayedArray_0.28.0     DOSE_3.28.2             xml2_1.3.6             
##  [43] ggforce_0.4.2           tidyselect_1.2.1        aplot_0.2.4            
##  [46] farver_2.1.2            viridis_0.6.5           BiocFileCache_2.10.2   
##  [49] jsonlite_1.8.9          tidygraph_1.3.1         bbmle_1.0.25.1         
##  [52] tools_4.3.2             progress_1.2.3          treeio_1.26.0          
##  [55] snow_0.4-4              Rcpp_1.0.13-1           glue_1.7.0             
##  [58] gridExtra_2.3           SparseArray_1.2.4       xfun_0.49              
##  [61] qvalue_2.34.0           withr_3.0.2             numDeriv_2016.8-1.1    
##  [64] BiocManager_1.30.25     fastmap_1.2.0           latticeExtra_0.6-30    
##  [67] digest_0.6.37           R6_2.5.1                gridGraphics_0.5-1     
##  [70] colorspace_2.1-0        GO.db_3.18.0            jpeg_0.1-10            
##  [73] biomaRt_2.58.2          RSQLite_2.3.9           R.methodsS3_1.8.2      
##  [76] tidyr_1.3.1             generics_0.1.3          data.table_1.16.4      
##  [79] rtracklayer_1.62.0      htmlwidgets_1.6.4       prettyunits_1.2.0      
##  [82] graphlayouts_1.2.1      httr_1.4.7              S4Arrays_1.2.1         
##  [85] scatterpie_0.2.4        graphite_1.48.0         pkgconfig_2.0.3        
##  [88] gtable_0.3.6            blob_1.2.4              hwriter_1.3.2.1        
##  [91] shadowtext_0.1.4        htmltools_0.5.8.1       ProtGenerics_1.34.0    
##  [94] scales_1.3.0            png_0.1-8               ggfun_0.1.8            
##  [97] knitr_1.49              rstudioapi_0.17.1       tzdb_0.4.0             
## [100] reshape2_1.4.4          rjson_0.2.23            nlme_3.1-166           
## [103] coda_0.19-4.1           curl_6.0.1              bdsmatrix_1.3-7        
## [106] cachem_1.1.0            parallel_4.3.2          HDO.db_0.99.1          
## [109] restfulr_0.0.15         pillar_1.10.0           vctrs_0.6.5            
## [112] dbplyr_2.5.0            evaluate_1.0.1          mvtnorm_1.3-2          
## [115] cli_3.6.2               locfit_1.5-9.10         compiler_4.3.2         
## [118] rlang_1.1.3             crayon_1.5.3            labeling_0.4.3         
## [121] interp_1.1-6            aroma.light_3.32.0      emdbook_1.3.13         
## [124] affy_1.80.0             plyr_1.8.9              fs_1.6.5               
## [127] stringi_1.8.4           viridisLite_0.4.2       deldir_2.0-4           
## [130] babelgene_22.9          munsell_0.5.1           lazyeval_0.2.2         
## [133] GOSemSim_2.28.1         Matrix_1.6-1.1          patchwork_1.3.0        
## [136] hms_1.1.3               bit64_4.5.2             KEGGREST_1.42.0        
## [139] statmod_1.5.0           igraph_2.1.2            memoise_2.0.1          
## [142] affyio_1.72.0           bslib_0.8.0             ggtree_3.10.1          
## [145] fastmatch_1.1-6         bit_4.5.0.1             gson_0.1.0             
## [148] ape_5.8-1